US-PAT-NO: 6285995

DOCUMENT-IDENTIFIER: US 6285995 B1

TITLE: Image retrieval system using a query image

DATE-ISSUED: September 4, 2001

INVENTOR-INFORMATION:

NAME                          CITY      STATE  ZIP CODE  COUNTRY
Abdel-Mottaleb; Mohammed S.   Ossining  NY     N/A       N/A
Krishnamachari; Santhana      Ossining  NY     N/A       N/A

APPL-NO: 09/102474

DATE FILED: June 22, 1998

US-CL-CURRENT: 707/3, 707/2, 707/6
ABSTRACT:

An image retrieval system contains a database with a large number of
images. The system retrieves images from the database that are similar to
a query image entered by the user. The images in the database are grouped
in clusters according to a similarity criterion so that mutually similar
images reside in the same cluster. Each cluster has a cluster center which
is representative of the images in it. A first step of the search for
similar images selects the clusters that may contain images similar to the
query image, by comparing the query image with the cluster centers of all
clusters. A second step of the search compares the images in the selected
clusters with the query image in order to determine their similarity to
the query image.

10 Claims, 7 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 4 
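
The two-step search summarized in the abstract can be sketched as follows.
This is only an illustration under stated assumptions: the abstract does
not name the similarity measure, so histogram intersection is used as a
stand-in, and the function names, the data layout and the parameters
n_clusters and n_results are hypothetical.

```python
def histogram_intersection(h1, h2):
    """Assumed similarity measure: sum of bin-wise minima of two histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def two_step_search(query, clusters, n_clusters=2, n_results=3):
    """clusters: list of (center_histogram, [(image_id, histogram), ...])."""
    # Step 1: compare the query with every cluster center, keep the best clusters.
    ranked = sorted(clusters,
                    key=lambda c: histogram_intersection(query, c[0]),
                    reverse=True)
    selected = ranked[:n_clusters]
    # Step 2: compare the query only with images inside the selected clusters.
    scored = [(histogram_intersection(query, hist), image_id)
              for _, members in selected
              for image_id, hist in members]
    scored.sort(reverse=True)
    return [image_id for _, image_id in scored[:n_results]]


# Hypothetical three-bin histograms, already normalized to sum to 1.
clusters = [
    ([0.8, 0.1, 0.1], [("sky1", [0.9, 0.05, 0.05]), ("sky2", [0.7, 0.2, 0.1])]),
    ([0.1, 0.8, 0.1], [("grass1", [0.1, 0.85, 0.05])]),
    ([0.1, 0.1, 0.8], [("sand1", [0.05, 0.1, 0.85])]),
]
print(two_step_search([0.85, 0.1, 0.05], clusters, n_clusters=1))
```

Only the members of the selected clusters are compared in the second
step, which is where the saving over an exhaustive comparison comes from.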
Brief Summary Text - BSTX (5):

An image retrieval system and a method as described above are known from
the article "Tools and Techniques for Color Image Retrieval", John R.
Smith and Shih-Fu Chang, Proc. SPIE - Int. Soc. Opt. Eng. (USA), Vol.
2670, pp. 426-437. The image retrieval system comprises a database with a
large number of images. A user searching for a particular image specifies
a query image showing what the retrieved image or images should look like.
Then the system compares the stored images with the query image and ranks
the stored images according to their similarity to the query image. The
ranking results are presented to the user, who may retrieve one or more of
the images. The comparison of the query image with a stored image to
determine the similarity may be based on a number of features derived from
the respective images. The image feature or features used for comparison
are called a feature vector. The article describes the use of a color
histogram as such a feature vector. When using the RGB (Red, Green and
Blue) representation of an image, a color histogram is computed by
quantizing the colors within the image and counting the number of pixels
of each color. To determine the similarity, a number of techniques are
described to compare the two color histograms of the respective images.
An example of such a technique is the histogram intersection, where the
similarity is the sum over all histogram bins of the minimal value of the
pair of corresponding bins of the two histograms.
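
As a minimal sketch of the two operations just described, the following
Python code quantizes RGB pixels into a color histogram and computes the
histogram intersection. The number of quantization levels per channel and
the function names are assumptions made for illustration; the cited
article may quantize differently.

```python
def color_histogram(pixels, levels=4):
    """Quantize 8-bit RGB pixels to `levels` values per channel and count
    the number of pixels that fall into each quantized color."""
    hist = [0] * (levels ** 3)
    for r, g, b in pixels:
        rq, gq, bq = r * levels // 256, g * levels // 256, b * levels // 256
        hist[(rq * levels + gq) * levels + bq] += 1
    return hist


def histogram_intersection(h1, h2):
    """Sum over all bins of the smaller of the two corresponding bin values."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Two hypothetical three-pixel images.
a = color_histogram([(255, 0, 0), (250, 10, 5), (0, 0, 255)])
b = color_histogram([(255, 0, 0), (0, 255, 0), (0, 0, 250)])
print(histogram_intersection(a, b))  # 2: one shared red pixel, one shared blue pixel
```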

Detailed Description Text - DETX (41):

In the embodiments of the image retrieval system described above, a single
color histogram is made from the whole image. Because of this, the spatial
information from the image is lost and the comparison of two images
reflects only global similarity. For example, if a user enters a query
image with a sky at the top and sand at the bottom, the retrieved images
are expected to have a mix of blue and beige, but not necessarily a sky
and sand. A desirable result for the retrieved candidate images would be
images with blue at the top and beige at the bottom. In order to achieve
this result, a further embodiment of the system according to the invention
determines a color histogram for a number of respective regions of the
query image and compares these determined histograms with histograms of
corresponding regions of the candidate image. The query image may be
divided into regions using pre-fixed boundaries, e.g. the division of the
image into a number of rectangles. Furthermore, the regions may be
indicated manually by the user, taking into account important objects in
the query image. In this way, the user ensures that a histogram is made
for a region comprising the object of interest. The choice of the region
size is important since it governs the emphasis that is given to local
information. In one extreme, the whole image is considered as a single
region so that only global information is used for the comparison. In the
other extreme, the region size matches the individual pixels. In one of
the further embodiments of the retrieval system according to the
invention, the images are divided into 4 × 4 rectangular regions.
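
The division into a fixed grid of rectangular regions, with one color
histogram per region, could look like the sketch below. The image
representation (a list of rows of RGB tuples), the coarse two-level
quantization and the function name are assumptions for illustration only.

```python
def region_histograms(image, grid=4, levels=2):
    """image: list of rows, each row a list of (r, g, b) tuples.
    Returns grid * grid color histograms, ordered row by row."""
    height, width = len(image), len(image[0])
    hists = [[0] * (levels ** 3) for _ in range(grid * grid)]
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            # Index of the rectangular region this pixel falls into.
            region = (y * grid // height) * grid + (x * grid // width)
            rq, gq, bq = r * levels // 256, g * levels // 256, b * levels // 256
            hists[region][(rq * levels + gq) * levels + bq] += 1
    return hists


# Hypothetical 8 x 8 image: blue "sky" in the top half, beige "sand" below.
blue, beige = (0, 0, 255), (245, 245, 220)
image = [[blue] * 8 for _ in range(4)] + [[beige] * 8 for _ in range(4)]
hists = region_histograms(image)
print(hists[0])   # top-left region: all pixels in the blue bin
print(hists[15])  # bottom-right region: all pixels in the beige bin
```

Setting grid=1 gives the single global histogram of the earlier
embodiments, and a grid as large as the image gives the per-pixel extreme.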
ABSTRACT:

In an image retrieval system, a database with a large number of images is
searched to find one or more images meeting the specification of a user.
This specification is given in the form of a query image. The system
determines the similarity between the query image and a particular image
from the database by comparing the color histograms of the two images.
The histograms are treated as statistical distributions and the similarity
is determined on the basis of an information theoretic measure of the
distributions. In a first embodiment, the similarity is determined using
the Kullback informational divergence of the two histograms. In a second
embodiment, the similarity is based on the entropy of the distribution of
similarity coefficients of the two histograms.

07/10/2004, EAST Version: 1.4.1

9 Claims, 5 Drawing figures
Exemplary Claim Number: 1
Number of Drawing Sheets: 4



Brief Summary Text - BSTX (7):

An image retrieval system and a method as described above are known from
the article "Tools and Techniques for Color Image Retrieval", John R.
Smith and Shih-Fu Chang, Proc. SPIE - Int. Soc. Opt. Eng. (USA), Vol.
2670, pp. 426-437. The image retrieval system includes a database with a
large number of images. A user searching for a particular image specifies
a query image showing what the retrieved image or images should look like.
Then the system compares the stored images with the query image and ranks
the stored images according to their similarity to the query image. The
ranking results are presented to the user, who may retrieve one or more of
the images. The comparison of the query image with a stored image to
determine the similarity may be based on a number of features derived from
the respective images. The article describes the use of a color histogram
as such a comparison feature. When using the RGB (Red, Green and Blue)
representation of an image, a color histogram is computed by quantizing
the colors within the image and counting the number of pixels of each
color. To determine the similarity, a number of techniques are described
to compare the two color histograms of the respective images. The
histogram Euclidean distance is a simple measure calculated by comparing
identical bins in the respective histograms. No cross-wise comparison is
made between different bins which represent perceptually similar colors.
Furthermore, techniques for determining a histogram intersection and
techniques for determining a histogram quadratic distance are described.
As an alternative to the histogram techniques, a comparison technique
based on color sets is described. In this technique the color of a pixel
is compared with a predetermined threshold. If the color is below the
threshold, the pixel does not become a member of the set; otherwise it
does. A disadvantage is that a large number of pixels, all below the
threshold, do not contribute to the comparison in any way. Furthermore,
there is no discrimination between values above the threshold. The prior
art techniques for determining the similarity between the candidate image
and the query image are complex to execute and/or are occasionally
inadequate.
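
The difference between the bin-by-bin Euclidean distance and a quadratic
distance with cross-bin terms can be illustrated with the sketch below.
It assumes the common quadratic form (h - g)^T A (h - g); the similarity
matrix A and its values are hypothetical, and the cited article may
define its measures differently.

```python
import math


def euclidean_distance(h, g):
    """Compares identical bins only; no cross-wise terms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h, g)))


def quadratic_distance(h, g, similarity):
    """Assumed quadratic form (h - g)^T A (h - g), where similarity[i][j]
    expresses how perceptually close the colors of bins i and j are."""
    d = [a - b for a, b in zip(h, g)]
    n = len(d)
    return sum(d[i] * similarity[i][j] * d[j]
               for i in range(n) for j in range(n))


# Hypothetical matrix: bins 0 and 1 hold similar colors, bin 2 is unrelated.
A = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
h, g, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
print(euclidean_distance(h, g), euclidean_distance(h, k))        # identical
print(quadratic_distance(h, g, A), quadratic_distance(h, k, A))  # g is closer
```

The Euclidean distance cannot tell g from k, while the quadratic form
can, at the cost of a number of operations that grows with the square of
the number of bins.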



Brief Summary Text - BSTX (10):

An embodiment of the image retrieval system according to the invention
uses a Kullback informational divergence. The Kullback informational
divergence is a measure for determining how different one statistical
distribution is from another statistical distribution. The inventor has
realized that a color histogram can be treated as a statistical
distribution and that the Kullback informational divergence can be applied
for comparing the candidate color histogram with the query color
histogram. Experiments have shown that retrieval of images on the basis of
a similarity obtained from applying the Kullback informational divergence
on the respective color histograms gives very good results. Furthermore,
the calculation of the Kullback informational divergence requires less
computational effort than the known techniques, which is very important
since a large number of candidate images may need to be compared with the
query image.
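
A minimal sketch of applying the divergence to two color histograms is
given below, using the standard definition D(p || q) = sum_i p_i log(p_i
/ q_i). Several details are assumptions not taken from the text above:
the direction (query relative to candidate), the small constant eps that
keeps empty bins from producing log(0), and the function names.

```python
import math


def normalize(hist, eps=1e-9):
    """Turn bin counts into a probability distribution.
    eps is an assumed smoothing constant so that no bin is exactly zero."""
    total = sum(hist) + eps * len(hist)
    return [(count + eps) / total for count in hist]


def kullback_divergence(query_hist, candidate_hist):
    """D(p || q) with p the query and q the candidate distribution.
    Zero for identical distributions; larger means more different."""
    p = normalize(query_hist)
    q = normalize(candidate_hist)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical three-bin histograms (pixel counts).
query = [8, 2, 0]
print(kullback_divergence(query, [8, 2, 0]))  # 0.0
print(kullback_divergence(query, [7, 3, 0]))  # small: similar color mix
print(kullback_divergence(query, [0, 2, 8]))  # large: very different mix
```

A single pass over the bins suffices, in contrast to measures that
compare every bin with every other bin.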



Brief Summary Text - BSTX (12):

A further embodiment of the image retrieval system according to the
invention compares two color histograms of respective regions of the
candidate image with two color histograms of corresponding regions of the
query image, the spatial information in the respective images being
employed when determining the similarity. This improves the accuracy of
determining the similarity between the candidate image and the query image
and a better discrimination among the images in the database can be
achieved.



Detailed Description Text - DETX (2):

FIG. 1 schematically shows an image retrieval system according to the
invention. The system 100 includes a database 102 with a potentially large
collection 104 of images. A purpose of such a system is to retrieve from
the collection one or more images that match the wishes of a user of the
system. Those wishes are specified via a query image 106, which the user
can enter into the system via entry means 108. The entry unit may allow
the user to compose the query image from a number of existing images or to
create the query image from scratch. The system compares the query image
with the candidate images in the database and determines for each
candidate image how similar it is to the query image. The system ranks the
candidate images according to the established similarity. The system 100
compares images on the basis of their color histograms. To this end, the
system includes a first histogram unit 110 to determine a query color
histogram 112 from the query image 106. The process of determining a color
histogram from an image is explained in FIG. 3 below. The system also
includes a second histogram unit 114 to determine a candidate color
histogram 116 from a particular candidate image 118. The first histogram
unit and the second histogram unit may be integrated into one histogram
means, which can act on the query image for generating the query color
histogram and on the particular candidate image for generating the
candidate color histogram, respectively. The system further includes a
determining unit 120 to determine a similarity 122 on the basis of the
query color histogram 112 and the candidate color histogram 116. Based on
the similarity, the system presents a ranking of the candidate images on a
display 126. The user may select an image from this ranking, which is then
retrieved from the database via retrieval means 124 for further
processing. This further processing may include temporarily storing the
image in a file 128 for further selection. This may be implemented such
that the system retrieves a number of candidate images and stores these in
the file 128, from where the user makes a final selection as to which
image is desired. In such a way of working, the system makes a first
selection from the large collection in the database 102 and the user
selects the image or images from the much smaller collection in file 128.
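
The ranking and first selection described for FIG. 1 might be sketched as
follows. The function name, the dictionary layout and the overlap measure
passed in as the similarity are hypothetical; any similarity function
that returns larger values for more similar histograms could be used.

```python
def shortlist(query_hist, candidates, similarity, size=3):
    """candidates: dict mapping image id -> candidate color histogram.
    Ranks all candidates by similarity to the query and keeps the best
    `size` ids, playing the role of the smaller collection in file 128."""
    ranked = sorted(
        candidates,
        key=lambda image_id: similarity(query_hist, candidates[image_id]),
        reverse=True,
    )
    return ranked[:size]


# Assumed stand-in for the determining unit: overlap of bin counts.
overlap = lambda h, g: sum(min(a, b) for a, b in zip(h, g))

candidates = {"a": [5, 5, 0], "b": [10, 0, 0], "c": [0, 0, 10]}
print(shortlist([9, 1, 0], candidates, overlap, size=2))  # ['b', 'a']
```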



Detailed Description Text - DETX (24):

In the embodiments of the image retrieval system described above, a single
color histogram is made from the whole image. Because of this, the spatial
information from the image is lost and the comparison of two images
reflects only global similarity. For example, if a user enters a query
image with a sky at the top and sand at the bottom, the retrieved images
are expected to have a mix of blue and beige, but not necessarily a sky
and sand. A desirable result for the retrieved candidate images would be
images with blue at the top and beige at the bottom. In order to achieve
this result, a further embodiment of the system according to the invention
determines a color histogram for a number of respective regions of the
query image and compares these determined histograms with histograms of
corresponding regions of the candidate image. The query image may be
divided into regions using pre-fixed boundaries, e.g. the division of the
image into a number of rectangles. Furthermore, the regions may be
indicated manually by the user, taking into account important objects in
the query image. In this way, the user ensures that a histogram is made
for a region comprising the object of interest. The choice of the region
size is important since it governs the emphasis that is given to local
information. In one extreme, the whole image is considered as a single
region so that only global information is used for the comparison. In the
other extreme, the region size matches the individual pixels. In one of
the further embodiments of the retrieval system according to the
invention, the images are divided into 4 × 4 rectangular regions.
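
The sky-and-sand example can be made concrete with the sketch below,
which compares histograms of corresponding regions. It rests on several
assumptions: each image is already reduced to a list of per-region
histograms, the per-region measure is a smoothed Kullback divergence, and
the regional values are combined by a plain average. The passage above
does not state how regional results are combined, so the averaging is
purely illustrative.

```python
import math


def divergence(p_counts, q_counts, eps=1e-9):
    """Smoothed Kullback divergence between two histograms of bin counts."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    return sum(((p + eps) / p_total)
               * math.log(((p + eps) / p_total) / ((q + eps) / q_total))
               for p, q in zip(p_counts, q_counts))


def regional_divergence(query_regions, candidate_regions):
    """Average divergence over pairs of corresponding regions (assumed rule)."""
    pairs = list(zip(query_regions, candidate_regions))
    return sum(divergence(p, q) for p, q in pairs) / len(pairs)


# Two regions (top, bottom) and two bins (blue, beige), all hypothetical.
query = [[10, 0], [0, 10]]      # sky at the top, sand at the bottom
same_layout = [[9, 1], [1, 9]]  # similar layout
flipped = [[0, 10], [10, 0]]    # same colors overall, opposite layout

global_hist = lambda regions: [sum(bins) for bins in zip(*regions)]
print(divergence(global_hist(query), global_hist(flipped)))  # 0.0
print(regional_divergence(query, same_layout))               # small
print(regional_divergence(query, flipped))                   # large
```

The global histograms of the query and the flipped image are identical,
so only the regional comparison separates them.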



Detailed Description Text - DETX (50):

FIG. 5 shows an overview of the method according to the invention. In a
first step 502, a query image is obtained containing the wishes of the
user. This image may be composed from existing images or may be sketched
by the user, possibly on the basis of an existing image. Then in a second
step 504, a query color histogram of the query image is determined. This
query color histogram will be used in comparing the query image with
candidate images from a database. In a third step 506, a candidate color
histogram of one of these candidate images is obtained. Preferably this
candidate color histogram has been prepared in advance, at the moment the
candidate image was stored in the database. Obtaining the candidate color
histogram then comes down to simply retrieving the histogram.
Alternatively, the candidate color histogram could be created at this
instant, i.e. at the time when it is needed. When the candidate color
histogram has been obtained, the similarity between the query image and
the candidate image is determined in a determining step 508. If in a
comparison step 510 it is ascertained that the images are similar enough,
the particular candidate image is retrieved from the database in retrieval
step 512. The particular candidate image may be directly presented to the
user or may be temporarily stored in a file for later inspection. Then in
step 514 it is determined whether all candidate images in the database
have been dealt with. If this is not the case, a candidate color histogram
of a next candidate image is obtained in step 506 and the process is
repeated for this next candidate image.
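
The loop of steps 506 to 514 can be sketched as below, assuming the
candidate histograms were prepared in advance and are held in a
dictionary. The function name, the threshold parameter and the overlap
measure used in the example are hypothetical; the text only requires that
the images be "similar enough".

```python
def retrieve(query_hist, stored_histograms, similarity, threshold):
    """stored_histograms: dict mapping image id -> histogram prepared when
    the image was stored. Returns the ids of sufficiently similar images."""
    retrieved = []
    for image_id, candidate_hist in stored_histograms.items():  # steps 506, 514
        score = similarity(query_hist, candidate_hist)          # step 508
        if score >= threshold:                                  # step 510
            retrieved.append(image_id)                          # step 512
    return retrieved


# Assumed similarity for the example: overlap of bin counts.
overlap = lambda h, g: sum(min(a, b) for a, b in zip(h, g))

stored = {"beach": [5, 5, 0], "forest": [0, 2, 8], "dune": [4, 6, 0]}
print(retrieve([5, 5, 0], stored, overlap, threshold=8))  # ['beach', 'dune']
```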